Generation of 3D data by deep neural networks has been attracting increasing attention in the research community. The majority of extant works resort to regular representations such as volumetric grids or collections of images; however, these representations obscure the natural invariance of 3D shapes under geometric transformations and also suffer from a number of other issues. In this paper we address the problem of 3D reconstruction from a single image, generating a straightforward form of output: point cloud coordinates. Along with this problem arises a unique and interesting issue, that the ground-truth shape for an input image may be ambiguous. Driven by this unorthodox output form and the inherent ambiguity in the ground truth, we design an architecture, loss function, and learning paradigm that are novel and effective. Our final solution is a conditional shape sampler, capable of predicting multiple plausible 3D point clouds from an input image. In experiments, not only can our system outperform state-of-the-art methods on single-image-based 3D reconstruction benchmarks, but it also shows strong performance on 3D shape completion and a promising ability to make multiple plausible predictions.
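Because a point cloud is an unordered set, a training loss for this output form must be invariant to point permutations. One standard choice for comparing unordered point sets is the Chamfer distance; the sketch below is illustrative only (the function name and the exact formulation are assumptions, not necessarily the loss used in the paper):

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two point sets.

    p: (N, 3) array of predicted points; q: (M, 3) array of
    ground-truth points. Permutation-invariant by construction,
    since each point is matched to its nearest neighbor.
    """
    # Pairwise squared Euclidean distances via broadcasting, shape (N, M).
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    # Average nearest-neighbor distance in both directions.
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

For example, two identical point sets yield a distance of zero, and reordering the rows of either argument leaves the value unchanged, which is exactly the invariance the set-valued output requires.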